Layer-wise Learning of Stochastic Neural Networks with Information Bottleneck
Authors
Abstract
In this paper, we present a layer-wise learning scheme for stochastic neural networks (SNNs) from an information-theoretic perspective. In each layer of an SNN, compression and relevance are defined to quantify the amount of information that the layer contains about the input space and the target space, respectively. We jointly optimize the compression and the relevance of all parameters in an SNN to better exploit the neural network's representation. Previously, the Information Bottleneck (IB) framework ([28]) extracted relevant information for a target variable. Here, we propose the Parametric Information Bottleneck (PIB) for a neural network, which uses only the network's model parameters explicitly to approximate the compression and the relevance. We show that the PIB framework can be considered an extension of the maximum likelihood estimate (MLE) principle to every layer. We also show that, compared to the MLE principle, PIB: (i) improves the generalization of neural networks in classification tasks, and (ii) exploits a neural network's representation more efficiently, pushing it closer to the optimal information-theoretic representation more quickly.
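The compression and relevance described above are the mutual informations I(T;X) and I(T;Y) between a layer's stochastic representation T, the input X, and the target Y. As a minimal illustrative sketch (not the paper's PIB algorithm), the following computes both quantities exactly for a toy stochastic binary layer; the scalar parameters `w` and `b` are hypothetical, chosen only for the example:

```python
import numpy as np

def mutual_information(joint):
    """I(A;B) in nats from a joint probability table p(a, b)."""
    pa = joint.sum(axis=1, keepdims=True)  # marginal p(a)
    pb = joint.sum(axis=0, keepdims=True)  # marginal p(b)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])))

# Toy setup: 4 inputs x with uniform p(x); binary label y = x mod 2.
n_x = 4
p_x = np.full(n_x, 1.0 / n_x)
y_of_x = np.arange(n_x) % 2

# A stochastic "layer": one binary unit whose firing probability is a
# sigmoid of a scalar affine map of x (hypothetical parameters w, b).
w, b = 2.0, -3.0
p_t1 = 1.0 / (1.0 + np.exp(-(w * np.arange(n_x) + b)))
p_t_given_x = np.stack([1.0 - p_t1, p_t1], axis=1)  # rows: x, cols: t

# Joint tables needed for compression I(T;X) and relevance I(T;Y).
joint_xt = p_x[:, None] * p_t_given_x
joint_yt = np.zeros((2, 2))
for x in range(n_x):
    joint_yt[y_of_x[x]] += joint_xt[x]

compression = mutual_information(joint_xt)  # I(T;X)
relevance = mutual_information(joint_yt)    # I(T;Y)
beta = 0.5  # trade-off weight, as in the IB objective
ib_objective = relevance - beta * compression
```

Because Y depends on T only through X (Markov chain Y–X–T), the data-processing inequality guarantees relevance ≤ compression here; an IB-style objective rewards relevance while penalizing compression.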
Similar resources
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown strong performance in speech recognition systems for feature extraction and also for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Wavelet Neural Network with Random Wavelet Function Parameters
The training algorithm of Wavelet Neural Networks (WNNs) is a bottleneck that impacts the accuracy of the final WNN model. Several methods have been proposed for training WNNs. From the perspective of our research, most of these algorithms are iterative and need to adjust all the parameters of the WNN. This paper proposes a one-step learning method which changes the weights between hidden la...
Prediction of breeding values for the milk production trait in Iranian Holstein cows applying artificial neural networks
Artificial neural networks, learning algorithms and mathematical models mimicking the information-processing ability of the human brain, can be used for non-linear and complex data. The aim of this study was to predict the breeding values for the milk production trait in Iranian Holstein cows applying artificial neural networks. Data on 35167 Iranian Holstein cows recorded between 1998 and 2009 were ...
Understanding Autoencoders with Information Theoretic Concepts
Despite their great success in practical applications, there is still a lack of theoretical and systematic methods to analyze deep neural networks. In this paper, we illustrate an advanced information-theoretic methodology to understand the dynamics of learning and the design of autoencoders, a special type of deep learning architecture that resembles a communication channel. By generalizing t...
Training Deep Neural Networks for Bottleneck Feature Extraction
In automatic speech recognition systems, preprocessing the audio signal to generate features is an important part of achieving a good recognition rate. Previous work has shown that artificial neural networks can be used to extract good, discriminative features that yield better recognition performance than manually engineered feature-extraction algorithms. One possible approach for this is to...
Journal title: CoRR
Volume: abs/1712.01272
Issue: -
Pages: -
Publication year: 2017